DeepSeek says its V4 model will have throughput issues until Ascend 950PR supernodes ‘ship at scale’ in the second half of the year
The hardware-software collaboration between the two Chinese firms underscores the progress China has made in tech self-reliance, a top national priority for Beijing amid US efforts to block its access to advanced semiconductor chips and chipmaking equipment.
During the livestream hosted on Bilibili and WeChat, Huawei engineers outlined the adaptation of its chips and flagship software system Compute Architecture for Neural Networks (CANN) – its equivalent to Nvidia’s dominant CUDA software – with DeepSeek’s latest model.
The company said its entire Ascend SuperNode product line was “fully adapted” to DeepSeek V4 for model inference, with performance that had “significantly improved” due to the two companies’ close collaboration before the model’s release.
Inference refers to the serving and running of an AI model. While cutting-edge Chinese models are still reliant on advanced US chips for training, computational demand for inference was set to surpass training demand globally by 2030, according to McKinsey & Company.
Phelix Lee, senior equity analyst at Morningstar, said the adaptation of Ascend and other domestic chips with DeepSeek V4 reflected the broader trend of Chinese AI model companies switching from foreign chips to domestic replacements.
Besides Huawei, Chinese chip designers Moore Threads and Cambricon Technologies also announced on Friday that their respective chips enjoyed “day zero” adaptation with V4.
Meanwhile, Tencent Holdings was one of the first domestic cloud service providers to make V4 available on its “TokenHub” platform, which it said was powered by chips from multiple domestic chipmakers.
These developments bode well for China’s entire domestic chip supply chain, according to Lee, who added that not only foundries but also domestic chip equipment vendors such as SMEE and Naura would benefit.
“The biggest [uncertainty] … is whether this increase in domestic AI demand will be profitable in the next two or three years amid elevated capital expenditures and low yields,” he said.
While US restrictions on access to advanced chips from the likes of Nvidia and AMD have slowed Chinese AI model development, they have also galvanised the domestic chip industry to step up to fill the gap.
Though DeepSeek did not disclose whether it trained its latest model on Nvidia chips, as it has done previously, domestic rivals are actively experimenting with using domestic chips for model training as they look to reduce their reliance on US chips.
On Friday, Chinese on-demand services group Meituan invited users to test a new trillion-parameter AI model, which local reports said was trained entirely on domestically produced chips.
Meituan did not immediately respond to a request for comment.
“V4 is simply a follow-through on that same trend, and trends don’t make headlines the way shocks do,” Su said, adding that the “reality” of the competitiveness and cost efficiency of Chinese AI had already been priced in.
Simon Willison, a prominent AI blogger, wrote on Friday that the new DeepSeek model was “the cheapest of the larger frontier models” globally.
In an extended technical report accompanying the new model, DeepSeek said V4 was competitive overall with leading closed-source US models, while being “approximately 3 to 6 months” behind on reasoning capabilities.
Benjamin Pou, Hong Kong-based founder of coding education company KTBYTE, said the rapid improvement of open-source Chinese models in recent months had made it “uneconomical” to use expensive US models for customer-facing services, particularly as the rise of AI agents greatly increased AI token usage.
“We should expect AI agents to be cheap and universal in the future just like web pages and social media are democratised today,” he said.
Additional reporting by Xinmei Shen